
    Hierarchical word clustering - automatic thesaurus generation

    In this paper, we propose a hierarchical, lexical clustering neural network algorithm that automatically generates a thesaurus (synonym abstraction) using purely stochastic information derived from unstructured text corpora and requiring no prior word classifications. The lexical hierarchy overcomes the Vocabulary Problem by accommodating paraphrasing through synonym clusters, and overcomes Information Overload by focusing search within cohesive clusters. We describe existing word categorisation methodologies, identifying their respective strengths and weaknesses, and evaluate our proposed approach against an existing neural approach, using a benchmark statistical approach and a human-generated thesaurus for comparison. We also evaluate our word context vector generation methodology against two similar approaches to investigate the effect of word vector dimensionality and of the number of words in the context window on the quality of the word clusters produced. We demonstrate the effectiveness of our approach and its superiority to existing techniques. (C) 2002 Elsevier Science B.V. All rights reserved.
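The abstract above refers to word context vectors built from a context window. As a rough illustration only, and not the paper's method, the sketch below counts co-occurrences within a symmetric window and compares two words by cosine similarity. The function names `context_vectors` and `cosine`, the toy sentence, and the window size are all assumptions made for this example.

```python
from collections import Counter, defaultdict
import math


def context_vectors(tokens, window=2):
    """Map each word to a Counter of the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the word itself
                vectors[word][tokens[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[key] for key, count in a.items() if key in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical toy corpus: "cat" and "dog" occur in similar contexts.
tokens = "the cat sat on the mat the dog sat on the rug".split()
vecs = context_vectors(tokens, window=2)
print(cosine(vecs["cat"], vecs["dog"]))
```

Widening `window` makes each vector denser, which is one way to explore the window-size effect the abstract mentions; a clustering step would then group words whose vectors are close.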

    A high performance k-NN approach using binary neural networks

    This paper evaluates a novel k-nearest neighbour (k-NN) classifier built from binary neural networks. The binary neural approach uses robust encoding to map standard ordinal, categorical and numeric data sets onto a binary neural network. The binary neural network uses high-speed pattern matching to recall a candidate set of matching records, which are then processed by a conventional k-NN approach to determine the k best matches. We compare various configurations of the binary approach with a conventional approach in terms of memory overheads, training speed, retrieval speed and retrieval accuracy. We demonstrate the superior speed and memory performance of the binary approach compared with the standard approach, and we pinpoint the optimal configurations. (C) 2003 Elsevier Ltd. All rights reserved.
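The two-stage idea described above (a fast binary match to recall candidates, then conventional k-NN on those candidates) can be sketched as follows. This is a simplified stand-in, not the binary neural network from the paper: the one-hot binning in `encode`, the bit-overlap ranking, and the names `two_stage_knn`, `edges`, and `n_candidates` are all assumptions chosen for illustration.

```python
import math
from collections import Counter


def encode(record, edges):
    """One-hot bin each numeric feature; return the set of active bit indices."""
    bits = set()
    offset = 0
    for value, feature_edges in zip(record, edges):
        bin_index = sum(value >= e for e in feature_edges)  # which bin the value falls in
        bits.add(offset + bin_index)
        offset += len(feature_edges) + 1  # each feature owns len(edges) + 1 bits
    return bits


def two_stage_knn(query, data, labels, edges, k=3, n_candidates=3):
    """Stage 1: rank records by shared bits. Stage 2: exact k-NN on the candidates."""
    q_bits = encode(query, edges)
    overlap = [(len(q_bits & encode(rec, edges)), i) for i, rec in enumerate(data)]
    overlap.sort(key=lambda pair: -pair[0])  # most shared bits first (stable sort)
    candidates = [i for _, i in overlap[:n_candidates]]
    nearest = sorted(candidates, key=lambda i: math.dist(query, data[i]))[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated classes, one bin edge per feature.
data = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]
edges = [[3.0], [3.0]]
print(two_stage_knn((1.1, 1.0), data, labels, edges))
print(two_stage_knn((5.1, 5.0), data, labels, edges))
```

The exact distance computation only touches `n_candidates` records, so the cost of the second stage no longer grows with the full data set; the trade-off is that a coarse encoding may drop a true neighbour from the candidate set.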

    Improved AURA k-Nearest Neighbour approach

    The k-Nearest Neighbour (kNN) approach is a widely used technique for pattern classification. Ranked distance measurements to a known sample set determine the classification of unknown samples. Though effective, kNN, like most classification methods, does not scale well with increased sample size. This is because the unknown query must be related to every other sample in the data space. To make this operation scalable, we apply AURA to the kNN problem. AURA is a highly scalable, associative-memory-based binary neural network intended for high-speed approximate search and match operations on large unstructured datasets. Previous work has applied AURA methods to this problem as a scalable but approximate kNN classifier. This paper continues that work by using AURA in conjunction with kernel-based input vectors to create a fast, scalable kNN classifier, whilst improving recall accuracy to levels similar to standard kNN implementations.

    Measurement of statistical evidence on an absolute scale following thermodynamic principles

    Statistical analysis is used throughout biomedical research and elsewhere to assess strength of evidence. We have previously argued that typical outcome statistics (including p-values and maximum likelihood ratios) have poor measure-theoretic properties: they can erroneously indicate decreasing evidence as data supporting a hypothesis accumulate; and they are not amenable to calibration, which is necessary for meaningful comparison of evidence across different study designs, data types, and levels of analysis. We have also previously proposed that thermodynamic theory, which allowed for the first time derivation of an absolute measurement scale for temperature (T), could be used to derive an absolute scale for evidence (E). Here we present a novel thermodynamically-based framework in which measurement of E on an absolute scale, for which "one degree" always means the same thing, becomes possible for the first time. The new framework invites us to think about statistical analyses in terms of the flow of (evidential) information, placing this work in the context of a growing literature on connections among physics, information theory, and statistics. Comment: Final version of manuscript as published in Theory in Biosciences (2013)

    On Lorentz Invariance, Spin-Charge Separation And SU(2) Yang-Mills Theory

    Previously it has been shown that in spin-charge separated SU(2) Yang-Mills theory Lorentz invariance can become broken by a one-cocycle that appears in the Lorentz boosts. Here we study in detail the structure of this one-cocycle. In particular we show that its non-triviality relates to the presence of a (Dirac) magnetic monopole bundle. We also explicitly present the finite version of the cocycle. Comment: 4 pages

    Hierarchical growing neural gas

    “The original publication is available at www.springerlink.com”. Copyright Springer. This paper describes TreeGNG, a top-down unsupervised learning method that produces hierarchical classification schemes. TreeGNG is an extension to the Growing Neural Gas algorithm that maintains a time history of the learned topological mapping. TreeGNG is able to correct poor decisions made during the early phases of the construction of the tree, and provides the novel ability to influence the general shape and form of the learned hierarchy.

    Observations of Stellar Objects at a Shell Boundary in the Star-Forming Complex in the Galaxy IC1613

    The single region of ongoing star formation in the galaxy IC 1613 has been observed in order to reveal the nature of compact emission-line objects at the edges of two shells in the complex, identified earlier in H-alpha line images. The continuum images show these compact objects to be stars. Detailed spectroscopic observations of these stars and the surrounding nebulae were carried out with the integral field spectrograph MPFS mounted on the 6m telescope of the Special Astrophysical Observatory. The resulting stellar spectra were used to determine the spectral types and luminosity classes of the objects. An Of star we identified is the only object of this spectral type in IC 1613. The results of optical observations of the multi-shell complex are compared to 21cm radio observations. The shells harboring the stars at their boundaries constitute the most active part of the star-forming region. There is evidence that shocks have played an important role in the formation of the shells. Comment: 10 pages, 5 PS and 1 color JPEG figure

    The Biot-Savart operator and electrodynamics on subdomains of the three-sphere

    We study steady-state magnetic fields in the geometric setting of positive curvature on subdomains of the three-dimensional sphere. By generalizing the Biot-Savart law to an integral operator BS acting on all vector fields, we show that electrodynamics in such a setting behaves rather similarly to Euclidean electrodynamics. For instance, for current J and magnetic field BS(J), we show that Maxwell's equations naturally hold. In all instances, the formulas we give are geometrically meaningful: they are preserved by orientation-preserving isometries of the three-sphere. This article describes several properties of BS: we show it is self-adjoint, bounded, and extends to a compact operator on a Hilbert space. For vector fields that act like currents, we prove the curl operator is a left inverse to BS; thus the Biot-Savart operator is important in the study of curl eigenvalues, with applications to energy-minimization problems in geometry and physics. We conclude with two examples, which indicate our bounds are typically within an order of magnitude of being sharp. Comment: 24 pages (was 28 pages). Revised to include a new introduction, a detailed example, and results about helicity; other changes for readability
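For orientation, the familiar Euclidean form of the operator that the abstract generalizes can be written as below. This is the standard flat-space expression with physical constants dropped, stated here as background and not taken from the article; the domain symbol \Omega and the precise class of "current-like" fields (for example, divergence-free fields tangent to the boundary) are assumptions of this sketch, and the three-sphere version replaces the kernel with its spherical analogue.

```latex
% Euclidean Biot-Savart operator on a domain \Omega \subset \mathbb{R}^3
% (constants such as \mu_0 omitted; a background sketch, not the article's formula)
\[
  \mathrm{BS}(J)(x) \;=\; \frac{1}{4\pi} \int_{\Omega} J(y) \times \frac{x - y}{\lvert x - y \rvert^{3}} \, dy .
\]
% "Curl is a left inverse to BS" for suitable current-like fields J:
\[
  \nabla \times \mathrm{BS}(J) \;=\; J \quad \text{in } \Omega .
\]
```

The second line is the flat-space counterpart of the left-inverse statement in the abstract: applying the curl to the field produced by a current recovers that current, which is why BS is useful for studying curl eigenvalues.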